R Advanced Course

Advanced spatial and temporal statistics

Objective and contents

Objective and contents

Objective: to perform multiple spatio-temporal analyses

  • Review of basic spatial and temporal statistics
  • Data extraction over areas of interest
  • Statistical trends analyses

Review of basic spatial and temporal statistics

Basic spatial and temporal statistics

Here is a summary of the most important functions to work with spatial data:

Function (terra) package description
rast Create a SpatRaster (one or multiple layers)
vect Create a SpatVector
crop Select a geographic subset of a SpatRaster
mask Replace values in a SpatRaster based on values in another SpatRaster
resample Resample (warp) values to a SpatRaster with a different origin and/or resolution
merge Combine SpatRasters with different extents (but same origin and resolution)
mosaic Combine SpatRasters with different extents using a function for overlapping cells

Basic spatial and temporal statistics

Function (terra) package description
add Add (in place) a SpatRaster to another SpatRaster object
global Compute global statistics
disagg Subdivide cells
aggregate Combine cells of a SpatRaster to create larger cells
app Apply a function to the values of each cell of a SpatRaster
classify (Re-)classify values
t Transpose a SpatRaster
project Project (warp) values to a SpatRaster with a different coordinate reference system
extract Extract grid-cells from the SpatRaster using a vector object

Exercise

Let’s crop and mask a raster file according to a shapefile. First we can import the data using the terra package.

shape <- "C:/catchment_boundaries/Trancura_catchment.shp"
rst   <- "C:/CHIRPS_monthly/chirps_monthly2000-01.tif"

shape <- vect(shape)
rst   <- rast(rst)

Importing data

Now, we can plot the raster layer.

plot(rst, main = "Precipitation for 2000-01")

Cropping and masking

Use the crop and mask functions:

rst <- crop(rst, shape, snap = "out")
rst <- mask(rst, shape)
plot(rst)
lines(shape, col = "red", lwd = 2)

Cropping and masking

Increasing spatial resolution

We can use the disagg function to disaggregate the raster using a bilinear interpolation.

rst <- disagg(rst, 5, method = "bilinear")
rst <- mask(rst, shape)
plot(rst)
lines(shape, col = "red", lwd = 2)

Increasing spatial resolution

We can use the disagg function to disaggregate the raster using a bilinear interpolation.

Global statistics

The global function will allow us to apply global statistics to the raster layer.

global(rst, fun = max, na.rm = TRUE)
##                            max
## chirps_monthly2000-01 59.09822
global(rst, fun = min, na.rm = TRUE)
##                            min
## chirps_monthly2000-01 24.17489
global(rst, fun = range, na.rm = TRUE)
##                             X1       X2
## chirps_monthly2000-01 24.17489 59.09822
global(rst, fun = sd, na.rm = TRUE)
##                             sd
## chirps_monthly2000-01 7.416875

Reclassifying a raster layer

We can use the classify function to reclassify the values of the raster. Lets locate the areas with precipitation higher that 50 mm.

rcl  <- matrix(c(0, 50, 0, 50, 100, 1), byrow = TRUE, ncol = 3)
clas <- classify(rst, rcl)
print(clas)
## class       : SpatRaster 
## dimensions  : 70, 55, 1  (nrow, ncol, nlyr)
## resolution  : 0.01, 0.01  (x, y)
## extent      : -71.9, -71.35, -39.65, -38.95  (xmin, xmax, ymin, ymax)
## coord. ref. : lon/lat WGS 84 (EPSG:4326) 
## source      : memory 
## name        : chirps_monthly2000-01 
## min value   :                     0 
## max value   :                     1

Reclassifying a raster layer

plot(clas, main = "P > 50 mm")

Data extraction over areas of interest

Data extraction over areas of interest

We can extract data from a selected catchment with the extract function. Let’s load all the raster layers from CHIRPSv2.

chirps <- "C:/Data/L2_Spatiotemporal_Statistics/CHIRPS_monthly"
chirps <- list.files(chirps, full.names = TRUE)
chirps <- rast(chirps)

Data extraction over areas of interest

Remember that we have already imported the shapefile of Trancura, so we can use the extract function to derive the mean value over each sub-catchment.

extracted_p <- t(terra::extract(chirps, shape, fun = mean))
extracted_p <- data.frame(extracted_p[-1,])
names(extracted_p) <- paste0("Catch_", 1:9)
head(extracted_p)
##                         Catch_1   Catch_2   Catch_3   Catch_4   Catch_5
## chirps_monthly2000.01  32.64311  30.25284  38.26172  48.81274  35.91513
## chirps_monthly2000.02 124.29573 124.33568 150.27121 164.53529 153.07471
## chirps_monthly2000.03  73.91442  84.02276  84.01890 137.00136  87.24352
## chirps_monthly2000.04 189.26642 180.37294 172.24188 182.42945 170.55661
## chirps_monthly2000.05 243.18256 248.56244 210.59882 256.72011 215.20582
## chirps_monthly2000.06 782.75869 841.81577 800.33160 868.29697 782.34115
##                         Catch_6   Catch_7   Catch_8   Catch_9
## chirps_monthly2000.01  42.56706  39.50981  36.47252  42.47352
## chirps_monthly2000.02 149.23172 120.45499 116.69346 118.91115
## chirps_monthly2000.03 122.30995  76.58796  74.26831  99.99020
## chirps_monthly2000.04 180.12965 179.48244 182.60414 180.43142
## chirps_monthly2000.05 240.73603 221.35588 205.27945 217.91149
## chirps_monthly2000.06 860.88827 829.38799 831.21078 805.42748
tail(extracted_p)
##                         Catch_1   Catch_2   Catch_3   Catch_4   Catch_5
## chirps_monthly2010.07 170.12679 173.92264 188.98456 230.49674 202.81184
## chirps_monthly2010.08 282.73925 309.65331 318.32141 416.14959 330.34954
## chirps_monthly2010.09  86.69548  71.53093  67.87955  78.87533  71.92223
## chirps_monthly2010.10 104.21935  83.50422  98.33887 111.21490  96.25324
## chirps_monthly2010.11 121.02811 111.46140 104.19596 152.06571  99.05009
## chirps_monthly2010.12 107.64020 102.22077 102.94524 168.42114 106.47013
##                         Catch_6   Catch_7   Catch_8   Catch_9
## chirps_monthly2010.07 205.50407 190.91494 184.33262 170.95299
## chirps_monthly2010.08 362.74336 266.60689 284.61003 328.65600
## chirps_monthly2010.09  73.45766  64.59489  66.67543  84.76546
## chirps_monthly2010.10 125.25329 103.88051  94.91428 130.00015
## chirps_monthly2010.11 167.71416 126.23355 122.48896 167.91479
## chirps_monthly2010.12 152.04451 110.47787 107.45323 159.51737

Data extraction over areas of interest

Now we can create a vector of dates and transform the object to zoo.

dates <- mip("2000-01-01", "2010-12-01")
extracted_p <- zoo(extracted_p, dates)
plot(extracted_p)

Data extraction over areas of interest

Temporal aggregation

We can convert the monthly values to annual values:

annual_p <- monthly2annual(extracted_p, FUN = sum)
plot(annual_p)

Temporal aggregation

Analysing the seasonality

We want to assess the difference between the mean maximum, mean, and minimum values over an area.

min_p  <- t(terra::extract(chirps, shape[1,], fun = min))
mean_p <- t(terra::extract(chirps, shape[1,], fun = mean))
max_p  <- t(terra::extract(chirps, shape[1,], fun = max))
res    <- data.frame(min = min_p[-1,1], mean = mean_p[-1,1], max = max_p[-1,1])
res    <- zoo(res, dates)
res    <- monthlyfunction(res, FUN = mean)
head(res)
##           Jan      Feb       Mar      Apr      May      Jun      Jul      Aug
## min  31.64041 36.39100  77.97065 123.4501 252.5469 357.7314 264.4958 228.2694
## mean 39.60057 45.48071 112.16043 142.5444 304.3520 395.3107 309.2674 289.4653
## max  60.95700 60.09460 181.23160 164.3283 384.5137 437.9277 359.5295 397.1406
##           Sep      Oct      Nov       Dec
## min  123.5629 114.9475 116.1880  76.54231
## mean 155.3641 144.5790 159.7892  97.86936
## max  195.5086 188.8239 217.2832 150.54336

Analysing the range ov values from a shapefile

plot(res[3,], type = "l", col = "red", xaxt = "n", ylab = "P[mm]", xlab = "Month",
     main = "Precipitation seasonality")
lines(res[2,], col = "black", lwd = 2)
lines(res[1,], col = "cyan")
legend("topright", c("MAX", "MEAN", "MIN"), lwd = c(1, 2, 1), 
       col = c("red", "black", "cyan"), bty = "n")
axis(1, at = 1:length(res[1,]), labels = month.abb)
grid()

Extracting all gridcells from a region

Now, we want to extract all cells within a region.

all <- t(terra::extract(chirps, shape[1,]))
all <- zoo(all[-1,], dates)
dim(all)
## [1] 132  22
head(all)
##                                                                                
## 2000-01-01  33.57729  37.75935  50.04085  50.48072  32.4786  30.55647  27.88526
## 2000-02-01 119.71941 132.78798 139.45903 117.18154 122.3020 112.37872 101.88407
## 2000-03-01 124.95280 110.19607 101.34157  82.94258  82.8429  74.71009  67.16457
## 2000-04-01 199.24543 208.54238 214.17587 181.86358 194.7734 194.21232 172.04716
## 2000-05-01 357.47462 314.03882 292.16184 273.60090 285.5774 254.89298 262.37548
## 2000-06-01 888.61125 864.39546 852.20703 792.10379 790.9331 794.26123 751.80734
##                                                                       
## 2000-01-01  32.23194  30.71408  28.06703  29.30369  27.58900  26.96193
## 2000-02-01 158.48489 148.82877 130.58752 148.01157 141.26768 130.25223
## 2000-03-01  83.87853  73.91518  64.79342  76.60091  64.85014  54.76036
## 2000-04-01 196.86121 199.94566 179.94651 183.82901 186.41501 174.78938
## 2000-05-01 264.90122 268.02260 246.06775 239.26090 227.78334 228.81692
## 2000-06-01 789.35866 810.10215 738.91630 747.79556 733.14226 694.96256
##                                                                       
## 2000-01-01  25.10713  30.56313  29.65025  28.00304  35.25353  34.11623
## 2000-02-01 126.80638 119.29774 103.25973  83.93491 116.42269 123.12563
## 2000-03-01  58.98059  75.39612  56.01731  52.21913  69.35712  57.37520
## 2000-04-01 189.53912 188.45879 184.41832 185.97168 184.41682 175.95889
## 2000-05-01 232.79722 235.45928 206.10833 224.01267 191.21100 185.62857
## 2000-06-01 731.91219 783.41621 748.73996 748.40588 830.60839 790.24056
##                                         
## 2000-01-01  33.12627  31.19857  33.48402
## 2000-02-01 121.24088 113.13896 124.13361
## 2000-03-01  59.23088  64.41759  70.17417
## 2000-04-01 186.47748 186.35213 195.62099
## 2000-05-01 186.00729 184.27336 189.54375
## 2000-06-01 773.76764 765.61688 799.38681

Extracting all gridcells from a region

Similarly, we can apply the monthly function to assess the dispersion of the precipitation values related to each month.

all <- monthlyfunction(all, FUN = mean)
head(all)
##         Jan      Feb      Mar      Apr      May      Jun      Jul      Aug
## V1 39.65565 50.87448 181.2316 147.9602 384.0746 433.1619 355.2262 397.1406
## V2 45.24225 50.79891 153.7298 159.7811 361.3362 424.5012 351.3820 360.2741
## V3 60.42500 58.09349 155.8524 155.5031 349.9457 419.8021 338.8527 347.9268
## V4 60.75074 51.49272 122.9051 132.9898 312.0844 388.4880 306.4354 311.2739
## V5 41.53849 46.49209 125.6721 148.5062 328.4876 405.0725 327.7640 315.9171
## V6 40.06596 46.48656 117.5933 138.9663 328.3735 399.6962 317.3586 300.8823
##         Sep      Oct      Nov      Dec
## V1 187.5400 185.5124 199.0732 126.5307
## V2 181.6526 172.9633 193.2174 127.8684
## V3 189.5458 186.3385 206.0154 150.5434
## V4 169.9558 169.8260 216.2802 124.9114
## V5 170.8939 155.7769 181.3202 104.3300
## V6 179.0379 156.9418 170.9045 100.5878

Extracting all gridcells from a region

Finally we can plot the values.

boxplot(all, col = "cyan", xaxt = "n", ylab = "P[mm]", xlab = "Month")
axis(1, at = 1:length(res[1,]), labels = month.abb)
grid()

Statistical trend analysis

Statistical trend analysis

We will evaluate the annual trends in precipitation as performed in Gebrechorkos et al., (2019) using monthly CHIRPSv2 data. For this purpose we have to crop the product to the study area.

Loading gridded data

First, we will load the annual CHIRPSv2 values.

chirps <- "C:/Data/L2_Spatiotemporal_Statistics/CHIRPS_annual"
chirps <- list.files(chirps, full.names = TRUE)
chirps <- rast(chirps)
print(chirps)
## class       : SpatRaster 
## dimensions  : 2000, 7200, 38  (nrow, ncol, nlyr)
## resolution  : 0.05, 0.05  (x, y)
## extent      : -180, 180, -50, 50  (xmin, xmax, ymin, ymax)
## coord. ref. : lon/lat WGS 84 (EPSG:4326) 
## sources     : chirps_annual1981.tif  
##               chirps_annual1982.tif  
##               chirps_annual1983.tif  
##               ... and 35 more source(s)
## names       :       sum,       sum,       sum,       sum,       sum,       sum, ... 
## min values  :         0,         0,         0,         0,         0,         0, ... 
## max values  :  9338.219,  7800.628,  9589.769,  9135.982,  9061.747,  9574.310, ...

Loading vector data

For this exercise, we will use the shapefile of the Nile Basin Countries

shape <- "C:/Countries/Nile_Basin_Countries_GAUL2014_2.shp"
shape <- vect(shape)
shape <- shape[7,]
print(shape)
##  class       : SpatVector 
##  geometry    : polygons 
##  dimensions  : 1, 9  (geometries, attributes)
##  extent      : 32.99994, 47.98618, 3.401109, 14.89416  (xmin, xmax, ymin, ymax)
##  coord. ref. : lon/lat WGS 84 (EPSG:4326) 
##  names       :       STATUS DISP_AREA ADM0_CODE ADM0_NAME STR0_YEAR EXP0_YEAR
##  type        :        <chr>     <chr>     <int>     <chr>     <int>     <int>
##  values      : Member State        NO        79  Ethiopia      1000      3000
##  Shape_Leng Shape_Area Name_label
##       <num>      <num>      <chr>
##       50.38      92.87   Ethiopia

Cropping and masking vector data

Now we will crop and mask the CHIRPS v2 product to the study area extent.

chirps <- crop(chirps, shape, snap = "out")
chirps <- mask(chirps, shape)
plot(chirps)

Man-Kendall trend test

It is a nonparametric rank-based statistical technique that is widely used to evaluate trends in time series.

  • The null hypothesis for this test is that there is no monotonic trend in the series.

  • The alternate hypothesis is that a trend exists.

  • This trend can be positive, negative, or non-null.

The change is assumed to be statistically significant at a probability level of 5%.

The Sen’s slope is used to calculate the magnitude of trends in long-term hydroclimatic data in slope magnitude per year. To apply the Man-Kendall trend test we will require the trend package.

Man-Kendall trend test

We will create a function that returns the p value of the Man-Kendall test:

get_pval <- function(x){
  
    if(!any(is.na(x))){
      
      mk <- mk.test(x)
      pval <- mk$p.value
    
  } else {
    pval <- NA
  }
  
  return(pval)
  
}

Obtaining significance

We can apply the function that we created.

pvals_chirps <- app(chirps, get_pval)
plot(pvals_chirps)

Obtaining the slope

The slope of the regression line (β) determines the change in y over the change in time (t). The magnitude of change in a time series is computed by using the Sen’s slope estimator (\(\beta\)). The magnitude of the slope is better represented by the Sen’s slope estimator because it detects better the linear relationships as is not affected by outliers.

Obtaining the slope

We can write a function to obtain the slope of the linear regression.

get_slope <- function(x, conf.level = 0.95, n.years = nlyr(chirps)){
  
    if(!any(is.na(x))){
      
      ss    <- sens.slope(x, conf.level = conf.level)
      slope <- ss$estimates * n.years
    
  } else {
    slope <- NA
  }
  
  return(slope)
  
}

Obtaining slope of the model

We can apply the function that we created.

slope_chirps <- app(chirps, get_slope)
plot(slope_chirps)

Analysing the results

Where do we have significant changes?

# Masking changes
sig_changes <- pvals_chirps
sig_changes[sig_changes > 0.05] <- NA
sig_changes[sig_changes > 0]    <- 1

# Adding direction of changes
unit_changes <- slope_chirps / abs(slope_chirps)
sig_changes  <- unit_changes * sig_changes

plot(sig_changes)
lines(shape)

Analysing the results

Where do we have significant changes?

Analysing the results

Of how much?

sig_changes <- abs(sig_changes)
slope_chirps <- slope_chirps * sig_changes

plot(sig_changes)
lines(shape)

Analysing the results

Of how much?